Implementing QR factorization updating algorithms on GPUs

Authors

  • Robert Andrew
  • Nicholas J. Dingle
Abstract

Linear least squares problems are commonly solved by QR factorization. When multiple solutions have to be computed with only minor changes in the underlying data, knowledge of the difference between the old data set and the new one can be used to update an existing factorization at reduced computational cost. This paper investigates the viability of implementing QR updating algorithms on GPUs. We demonstrate that GPU-based updating for removing columns achieves speed-ups of up to 13.5x compared with full GPU QR factorization. Other updates achieve speed-ups under certain conditions, and we characterize what these conditions are.
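As an illustration of the kind of update the abstract refers to, the following is a minimal CPU-side sketch in pure Python of column removal. It is not the paper's GPU implementation. It assumes the standard textbook approach: deleting column k from the upper-triangular factor R leaves an upper Hessenberg matrix, which a short sequence of Givens rotations restores to triangular form. The function name `delete_column_update` and the list-of-rows matrix layout are hypothetical choices made for this example.

```python
import math

def delete_column_update(R, k):
    """Drop column k from upper-triangular R (m x n, m >= n, list of rows)
    and re-triangularize with Givens rotations.

    Returns (R_new, rotations); each rotation is (row_index, c, s) and acts
    on rows row_index and row_index + 1. Hypothetical helper, not a library API.
    """
    m = len(R)
    # Removing column k shifts later columns left, leaving one nonzero
    # below the diagonal in every column from k onward (upper Hessenberg).
    H = [row[:k] + row[k + 1:] for row in R]
    n = len(H[0])
    rotations = []
    for j in range(k, min(n, m - 1)):
        a, b = H[j][j], H[j + 1][j]
        r = math.hypot(a, b)
        if r == 0.0:
            continue  # nothing to eliminate in this column
        c, s = a / r, b / r
        # Rotate rows j and j+1 so that the entry at (j+1, j) becomes zero.
        for col in range(j, n):
            top, bot = H[j][col], H[j + 1][col]
            H[j][col] = c * top + s * bot
            H[j + 1][col] = -s * top + c * bot
        rotations.append((j, c, s))
    return H, rotations

# Usage: remove the first column of a 3 x 3 triangular factor.
R = [[2.0, 1.0, 3.0],
     [0.0, 4.0, 5.0],
     [0.0, 0.0, 6.0]]
R_new, rotations = delete_column_update(R, 0)
```

The update touches only the columns from k onward and applies at most n - 1 rotations, whereas a full factorization reprocesses the whole matrix. That is the source of the reduced cost. The returned rotations can be applied to the columns of Q in the same order to obtain the updated orthogonal factor.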

Similar resources

QR Factorization Based Blind Channel Identification and Equalization with Second-Order Statistics

Most eigenstructure-based blind channel identification and equalization algorithms with second-order statistics need SVD or EVD of the correlation matrix of the received signal. In this paper, we address new algorithms based on QR factorization of the received signal directly without calculating the correlation matrix. This renders the QR factorization-based algorithms more robust against ill-c...

Design Space Exploration for GPU-Based Architecture

Recent advances in Graphics Processing Units (GPUs) provide opportunities to exploit GPUs for non-graphics applications. Scientific computation is inherently parallel, which makes it a good candidate for utilizing the computing power of GPUs. This report investigates QR factorization, which is an important building block of scientific computation. We analyze different mapping methods of QR factorization...

A scalable approach to solving dense linear algebra problems on hybrid CPU-GPU systems

Aiming to fully exploit the computing power of all CPUs and all GPUs on hybrid CPU-GPU systems to solve dense linear algebra problems, we design a class of heterogeneous tile algorithms to maximize the degree of parallelism, to minimize the communication volume, as well as to accommodate the heterogeneity between CPUs and GPUs. The new heterogeneous tile algorithms are executed upon our decentr...

LAPACK-Style Codes for Pivoted Cholesky and QR Updating

Routines exist in LAPACK for computing the Cholesky factorization of a symmetric positive definite matrix and in LINPACK there is a pivoted routine for positive semidefinite matrices. We present new higher level BLAS LAPACK-style codes for computing this pivoted factorization. We show that these can be many times faster than the LINPACK code. Also, with a new stopping criterion, there is more r...

Implementing Communication-optimal Parallel and Sequential QR Factorizations

We present parallel and sequential dense QR factorization algorithms for tall and skinny matrices and general rectangular matrices that both minimize communication, and are as stable as Householder QR. The sequential and parallel algorithms for tall and skinny matrices lead to significant speedups in practice over some of the existing algorithms, including LAPACK and ScaLAPACK, for example up t...

Journal:
  • Parallel Computing

Volume 40  Issue 

Pages  -

Publication date 2014